ABSTRACT

We present the results of a machine-learning (ML) based search for new R Coronae Borealis (RCB)
stars and DY Persei-like stars (DYPers) in the Galaxy using cataloged light curves from the All-Sky
Automated Survey (ASAS) Catalog of Variable Stars (ACVS). RCB stars, a rare class of
hydrogen-deficient carbon-rich supergiants, are of great interest owing to the insights they can
provide on the late stages of stellar evolution. DYPers are possibly the low-temperature,
low-luminosity analogs to the RCB phenomenon, though additional examples are needed to fully
establish this connection. While RCB stars and DYPers are traditionally identified by epochs of
extreme dimming that occur without regularity, the ML search framework more fully captures the
richness and diversity of their photometric behavior. We demonstrate that our ML method can use
newly discovered RCB stars to identify additional candidates within the same data set. Our search
yields 15 candidates that we consider likely RCB stars/DYPers: new spectroscopic observations
confirm that four of these candidates are RCB stars and four are DYPers. Our discovery of four new
DYPers increases the number of known Galactic DYPers from two to six; noteworthy is that one of the
new DYPers has a measured parallax and is m ≈ 7 mag, making it the brightest known DYPer to date.
Future observations of these new DYPers should prove instrumental in establishing the RCB
connection. We consider these results, derived from a machine-learned probabilistic classification
catalog, as an important proof-of-concept for the efficient discovery of rare sources with
time-domain surveys.

Subject headings: circumstellar matter - methods: data analysis - stars: carbon - stars: evolution -
stars: variables: other - techniques: photometric



1. INTRODUCTION 

R Coronae Borealis (RCB) stars are hydrogen-deficient carbon (HdC) stars that exhibit spectacular
(Δm_V up to ~8 mag), aperiodic declines in brightness (for a review on RCB stars see Clayton 1996).
The fading occurs rapidly (~1 to a few weeks) as new dust is formed in the circumstellar
environment, and the recovery is slow, sometimes taking several years, as the new dust is dispersed
and removed from the line of sight. At maximum light RCB stars are bright supergiants, which in
combination with the large-amplitude photometric variability should make them easy to discover.
Yet, to date there are only ~56 known RCB stars in the Galaxy (Clayton 1996; Clayton et al. 2002;
Zaniewski et al. 2005; Tisserand et al. 2008; Clayton et al. 2009; Kijbunchoo et al. 2011). The
rarity of these stars suggests that they reflect a very brief phase of stellar evolution, or a bias
in RCB star search methods, or both.

The lack of hydrogen and overabundance of carbon in RCB atmospheres implies that RCB stars are in a
late stage of stellar evolution, but no consensus has yet emerged regarding their true physical
nature. There are two leading theories for explaining the observed properties of RCB stars: the
Double Degenerate (DD) scenario and the final helium shell flash (FF) scenario (see e.g., Iben
et al. 1996). The DD scenario posits that RCB stars are the stellar remnant of a carbon-oxygen white
dwarf (WD) and helium WD merger. In the FF scenario, a thin layer of He in the interior of the star
begins runaway burning, which leads to the rapid expansion of the photosphere shortly before the
star becomes a planetary nebula. There are observational properties of RCB stars that both theories
have difficulty explaining (Clayton 1996), and conflicting observational evidence supporting aspects
of both (e.g., Clayton et al. 2007; Pandey et al. 2008; Clayton et al. 2006, 2011). If, as some of
the recent observations suggest, the DD scenario proves correct, then a complete census of Galactic
RCB stars should be able to calibrate population synthesis models of WD binary systems (e.g.,
Nelemans et al. 2001), which may improve our understanding of these systems as the progenitors of
Type Ia supernovae. In any event, the enigmatic nature of these rare objects, and the opportunity to
elucidate the astrophysics of an important late stage of stellar evolution, motivates us to search
for additional benchmark exemplars of the class.

Based on the detection of RCB stars in the Large Magellanic Cloud (LMC), Alcock et al. (2001) argue
that there should be ~3200 RCB stars in the Galaxy. The actual number of known RCB stars in the
Milky Way is roughly two orders of magnitude below this estimate, which suggests that either
thousands of RCB stars remain undetected or the differing star formation environments/histories in
the LMC and the Milky Way result in highly different RCB populations. An observational bias that
preferentially selects warm RCB stars likely contributes to the discrepancy between the predicted
and known number of these stars in the Galaxy (Lawson et al. 1990). Indeed, recent discoveries of
RCB stars in the Galactic bulge and Magellanic Clouds (MCs) have uncovered more cool
(T_eff ~5000 K) than warm (T_eff ~7000 K) RCB stars (Alcock et al. 2001; Zaniewski et al. 2005;
Tisserand et al. 2008, 2009). The observed correlation between color and M_V, with bluer RCB stars
in the MCs being more luminous (Alcock et al. 2001; Tisserand et al. 2009), clearly shows that any
magnitude-limited survey will have an observational bias towards discovering the intrinsically
rarer warm RCB stars.

There may also be a large population of RCB stars that have colder photospheres than the cool RCB
stars: there is one known Galactic RCB star, DY Persei (Alksnis 1994), that has T_eff ~3500 K
(Keenan & Barnbaum 1997). Recent observations of the MCs have identified several DY Persei-like
stars (DYPers) while searching for RCB stars (Alcock et al. 2001; Tisserand et al. 2009; Soszyński
et al. 2009), while Tisserand et al. (2008) discovered the second known DYPer in the Milky Way using
observations of the Galactic bulge. In addition to cooler photospheres, DYPers have other properties
that differ from RCB stars, which has led to some degree of ambiguity regarding the connection
between these two classes (see e.g., Alcock et al. 2001; Tisserand et al. 2009; Soszyński et al.
2009).

DYPers and RCB stars both show an overabundance of carbon in their atmospheres and unpredictable,
large-amplitude declines in their light curves. Several properties differ between the two, however.
For instance, DYPers: (i) have symmetric declines in their light curves, (ii) clearly show ^13C in
their spectra, (iii) are on average ~10 times fainter than RCB stars, and (iv) may have significant
H in their atmospheres. A detailed examination of the differences in the mid-infrared excesses of
RCB stars and DYPers in the MCs led Tisserand et al. (2009) to conclude that DYPers are most likely
normal carbon stars that experience ejection events rather than an extension of the RCB phenomenon
to lower temperature stars. Furthermore, using OGLE-III observations, Soszyński et al. (2009) show
that several carbon-rich asymptotic giant branch stars (AGBs), which have been classified as Mira or
semi-regular periodic variables on the basis of their light curves, show evidence for DYPer-like
declines in their light curves.^6 This leads Soszyński et al. to conclude that DYPers are heavily
enshrouded carbon-rich AGB stars that are an extension of typical variables rather than a separate
class of variable stars. Nevertheless, all studies of DYPers to date have cited a need for more
observations, in particular high resolution spectra to conduct detailed abundance analyses, to
confirm or deny the possibility that DYPers are the low temperature analogs to RCB stars.

6 We note that the sources included in the study of Soszyński et al. (2009) are photometrically
classified as carbon AGB. Thus, the candidates in that study require spectroscopic observations in
order to be confirmed as DYPers.

Over the past decade the decrease in the cost of large CCDs, coupled with a dramatic increase in
computer processing power and storage capabilities, has enabled several wide-field, time-domain
surveys. These surveys will continue to produce larger data sets before culminating near the end of
the decade with the Large Synoptic Survey Telescope (LSST; Ivezić et al. 2008). This explosion of
observations should enable the discovery of the thousands of "missing" Galactic RCB stars, should
they in fact exist. These new discoveries do not come without a cost, however, as the data rates of
astronomical surveys are now becoming enormous. While it was once feasible for humans to visually
examine the light curves of all the newly discovered variable stars, as the total number of
photometric variables grows to 10^-10^ visual inspection by expert astronomers becomes intractable.

Advanced software solutions, such as machine-learning (ML) algorithms, are required to analyze the
vast amounts of data produced by current and upcoming time-domain surveys. In an ML approach to
classification, data from sources of known science class are employed to train statistical
algorithms to automatically learn the distinguishing characteristics of each class. These
algorithms generate an optimal predictive model that can determine the class (or posterior class
probability) of a new source given its observed data.^7 Richards et al. (2011) presented an
end-to-end ML framework for multi-class variable star classification, in which they describe
algorithms for feature generation from single-band light curves and outline a methodology for
non-parametric, multi-class statistical classification.

In this paper we present the results of a search for new RCB stars and DYPers in the Galaxy using
version 2.3 of the ML catalog presented in Richards et al. (2012b). In §2 we describe the candidate
selection procedure, while §3 describes the new and archival observations of the candidates. Our
analysis of the photometric and spectroscopic data is contained in §4. The individual stars are
examined in further detail in §5, while we discuss the results in §6. Our conclusions are presented
in §7.

7 For a primer on machine learning, we refer the reader to Hastie et al. (2009).

2. CANDIDATE SELECTION

2.1. Advantages of Machine-Learning Classification

Candidate selection of possible RCB stars was performed using version 2.3 of the machine-learned
ACVS classification catalog (MACC; Richards et al. 2012b) of variable sources cataloged from the
All-Sky Automated Survey (ASAS; Pojmański 1997, 2001). Full details of the classification procedure
can be found in Richards et al. (2012a) and Richards et al. (2012b). Briefly, we employ a Random
Forest (RF) classifier, which has been shown to provide the most robust results for variable star
classification (see e.g., Richards et al. 2011; Dubath et al. 2011), to provide probabilistic
classifications for all of the 50,124 sources in the ASAS Catalog of Variable Stars (ACVS; Pojmański
2000). The classification procedure proceeds as follows: for each source in the ACVS 71 features are
computed, 66 from the ASAS light curves (e.g., period, amplitude, skew, etc.; for the full list of
features we refer the reader to Richards et al. 2012b and references therein) and 5 color features
from optical and near-infrared (NIR) catalogs. A training set, upon which the RF classifications
will be based, is constructed using light curves from 28 separate science classes, most of which are
defined using well studied stars with high precision light curves from the Hipparcos and OGLE
surveys (Debosscher et al. 2007; Richards et al. 2011), as well as some visually classified sources
from ACVS for a few of the classes that are not well represented in Hipparcos (Richards et al.
2012a). The same 71 features are calculated for all the sources in the training set, and the RF
classifier uses the separation of the 28 science classes in the multi-dimensional feature space to
assign probabilistic classifications to each source in the ACVS. In the end, the probability of
belonging to each individual science class is provided for each ACVS source and a post-RF procedure
is used to calibrate these probabilities (meaning that a source with P(Mira) = 0.5 has a ~50% chance
of actually being a Mira).
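
As a minimal illustration of what a light-curve feature is, the following Python sketch computes three simple summary statistics from a list of magnitudes. The function name `light_curve_features` and the exact definitions used here (half the peak-to-peak range for amplitude, population skewness) are assumptions made for this example; the 66 light-curve features actually used are defined in Richards et al. (2012b) and are not reproduced here.

```python
import statistics


def light_curve_features(mags):
    """Return a few illustrative summary features for a list of magnitudes."""
    n = len(mags)
    mean = statistics.fmean(mags)
    std = statistics.pstdev(mags)  # population standard deviation
    # Half the peak-to-peak range: one simple amplitude definition (assumed).
    amplitude = 0.5 * (max(mags) - min(mags))
    # Population skewness: third central moment divided by std cubed.
    if std > 0:
        skew = sum((m - mean) ** 3 for m in mags) / (n * std ** 3)
    else:
        skew = 0.0
    return {"amplitude": amplitude, "std": std, "skew": skew}


# Toy light curve (made-up values): mostly at 10 mag with one deep fade.
# Larger magnitudes are fainter, so a fade gives a tail toward large values.
toy = [10.0] * 8 + [14.0, 12.0]
features = light_curve_features(toy)
```

In this toy case the fade produces a positive skew in magnitude space, which is the kind of asymmetry a classifier can exploit alongside amplitude.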

When searching for RCB stars in time-domain survey data, RF classification provides a number of
advantages relative to the more commonly used method of placing hard cuts on a small set of
features. Many studies have focused on light curves with large amplitude variations and a lack of
periodic signal (e.g., Alcock et al. 2001; Zaniewski et al. 2005; Tisserand et al. 2008). A few
recent studies have noted that additional cuts on NIR and mid-infrared colors can improve selection
efficiency (Soszyński et al. 2009; Tisserand et al. 2011; Tisserand 2012). While these surveys have
all proven successful, the use of hard cuts may eliminate actual RCB stars from their candidate
lists.

Hard cuts are not necessary, however, when using a multi-feature RF classifier, which is capable (in
principle) of capturing most of the photometric behavior of RCB stars (including the
large-amplitude, aperiodic fades from maximum light as well as the periodic variations that occur
near maximum light). Another general disadvantage in the use of hard cuts for candidate selection of
rare sources is that the hard cuts are typically defined by known members of the class of objects
for which the search is being conducted. Any biases present in the discovery of the known members of
a particular class will then be encoded into the absolute (i.e., hard cuts) classification schema.
This can exclude subclasses of sources that differ slightly from the defining members of a class.
Furthermore, new discoveries will be unable to refine the selection criteria since, by
construction, they will fall within the same portion of feature space as previously known examples.

The RF classifier produces an estimate of the posterior probability that a source is an RCB star
given its light curve and colors. This allows us to construct a relative ranking of the RCB
likelihood for all the sources in ACVS. Instead of making cuts in feature space, we can search down
the ordered list of candidates. In this sense the RF classifier identifies the sources that are
closest to the RCB training set relative to the other classes. The RF classifier finds the class
boundaries in a completely data-driven way, allowing for the optimal use of known objects to search
for new candidates in multi-dimensional feature spaces. This helps to mitigate biases present in the
training set, as classifications are performed using the location of an individual source in the
multi-dimensional feature space relative to defined classes in the training set.
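
The idea of searching down an ordered list rather than cutting in feature space can be sketched in a few lines of Python. The function name `rank_candidates`, the source identifiers, and the probabilities below are all hypothetical; this is an illustration of the ranking step only, not the code used to build the MACC.

```python
def rank_candidates(p_rcb, training_ids=frozenset()):
    """Map each source id to its rank in P(RCB); rank 1 is the most RCB-like.

    Sources that belong to the training set are excluded from the ranking.
    """
    pool = [(sid, p) for sid, p in p_rcb.items() if sid not in training_ids]
    pool.sort(key=lambda item: item[1], reverse=True)
    return {sid: rank for rank, (sid, _) in enumerate(pool, start=1)}


# Hypothetical catalog: three unlabeled sources and one training source.
toy_catalog = {"src_a": 0.95, "src_b": 0.10, "src_c": 0.62, "train_1": 0.99}
ranks = rank_candidates(toy_catalog, training_ids={"train_1"})
```

Because only the ordering matters, the ranking is unchanged by any monotonic recalibration of the probabilities.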



2.2. The Training Set 

The MACC RCB training set was constructed using high-confidence positional matches between ACVS
sources and known RCB stars identified in SIMBAD^8 and the literature. In total there are 18
cataloged RCB stars that are included in the ACVS, which we summarize in Table 1. The light curves
of the known RCB stars were visually examined for the defining characteristic of the class: sudden,
aperiodic drops in brightness followed by a gradual recovery to pre-decline flux levels. All of the
known RCB stars but one, ASAS 054503-6424.4, showed evidence for such behavior. ASAS 054503-6424.4
is one of the brightest RCB stars in the LMC (V_max ≈ 13.75 mag), which during quiescence is barely
above the ASAS detection threshold. The light curve for ASAS 054503-6424.4 does not show a
convincing decline from maximum light, and as such we do not include it in the training set.

In addition to the 18 RCB stars in ACVS, 7 additional RCB stars are detected in ASAS with the
characteristic variability of the class.^9 These sources all have clearly variable ASAS light
curves; their exclusion from the ACVS means there is some bias in the construction of that catalog.
In order to keep this bias self-consistent the training set for the MACC only included sources from
Richards et al. (2011) and supplements from ACVS (see Richards et al. 2012a). We note that a future
paper to classify all ~12 million sources detected by ASAS will include all ASAS RCB stars in its
training set (Richards et al., in prep). Therefore the training set includes 17 RCB stars, which is
limited by the coverage and depth of ASAS, the selection criteria of the ACVS, and the paucity of
known RCB stars in the Galaxy. There are no known DYPers in ACVS: only two are known in the Galaxy
and the DYPers in the MCs are fainter than the ASAS detection limits. Nevertheless, the similarity
in the photometric behavior of RCB stars and DYPers allows us to use the RCB training set to search
for both types of star. As more Galactic RCB stars and DYPers are discovered, we will be able to
supplement the training set and improve the ability of future iterations of the RF classifier (see
§6.2).

8 http://simbad.u-strasbg.fr/simbad/

9 They are: SU Tau, UX Ant, UW Cen, V348 Sgr, GU Sgr, RY Sgr, and V532 Oph.

TABLE 1
Known RCB Stars in ACVS.

In order to determine our ability to recover known RCB stars using the RF classifier we perform a
leave-one-out cross-validation (CV) procedure. For the 17 sources in the RCB training set, we remove
one source and re-run the RF classifier in an identical fashion to that used in Richards et al.
(2012b). We then record the RF-determined probability that the removed source belongs to the RCB
class, P(RCB), and the ranked value of P(RCB) relative to all other stars that are not included in
the training set, R(RCB). We repeat the CV procedure for each star included in the training set, and
the results are shown in Table 1. Since the training set is being altered in each run of the CV,
R(RCB) provides a better measure of the quality of each candidate; R(RCB) is a relative quantity,
whereas the calibration of P(RCB) will differ slightly from run to run. Eight of the 17 sources in
the training set have R(RCB) ≤ 3, implying that ~50% of the training set would be a top three
candidate RCB star had we previously not known about it. Fifteen of the 17 RCB stars in the training
set would be in the top 0.8% of the 50,124 sources in the ACVS, while all the known RCB stars in
ACVS, including ASAS 054503-6424.4 (which is not in the training set), are in the top ~6% of RCB
candidates. Two sources in the training set, SV Sge and ES Aql, are not listed near the top of the
R(RCB) ranking during CV. For ES Aql this occurs because the star is highly active during the ASAS
observations, showing evidence for at least six separate declines during the ~10 yr observing
period. As a result the light curve folds fairly well on a period of ~397 day, and ES Aql becomes
confused with Mira and semi-regular periodic variables (see Figure 2). SV Sge, on the other hand,
shows significant periodicity at the parasite frequency of 1 day^-1, which precludes it from having
a high R(RCB). The CV procedure allows us to roughly tune the efficiency of our selection criteria;
the purity of the selection criteria cannot be evaluated until candidates have been
spectroscopically confirmed.
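
The leave-one-out procedure described above can be sketched as follows. This is a schematic illustration under stated assumptions: `train_and_score` is a hypothetical interface standing in for re-running the full RF classifier, and `toy_scorer` is a deliberately simple one-feature stand-in (not a Random Forest) used only so that the example runs. All source names and amplitudes are made up.

```python
def leave_one_out_ranks(training_set, unlabeled, train_and_score):
    """Hold out each training source in turn and rank it among unlabeled sources.

    train_and_score(train, targets) must return one score per target; it
    stands in for retraining the classifier without the held-out source.
    Returns {name: (score, rank)}, where rank 1 means that no unlabeled
    source scored higher than the held-out source.
    """
    results = {}
    for held_out in training_set:
        reduced = [s for s in training_set if s is not held_out]
        targets = [held_out] + list(unlabeled)
        scores = train_and_score(reduced, targets)
        p_held = scores[0]
        rank = 1 + sum(1 for s in scores[1:] if s > p_held)
        results[held_out["name"]] = (p_held, rank)
    return results


def toy_scorer(train, targets):
    """Toy stand-in for the RF: score falls with distance from the mean training amplitude."""
    centre = sum(s["amplitude"] for s in train) / len(train)
    return [1.0 / (1.0 + abs(t["amplitude"] - centre)) for t in targets]


training = [
    {"name": "A", "amplitude": 3.0},
    {"name": "B", "amplitude": 3.2},
    {"name": "C", "amplitude": 1.0},
]
unlabeled = [
    {"name": "u1", "amplitude": 0.3},
    {"name": "u2", "amplitude": 2.0},
    {"name": "u3", "amplitude": 0.5},
]
cv = leave_one_out_ranks(training, unlabeled, toy_scorer)
```

The rank is computed only against sources outside the training set, mirroring the definition of R(RCB) in the text.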

2.3. The Candidates 

Due to the relative rarity of RCB stars, we elected to generate a candidate list with high
efficiency while sacrificing the possibility of high purity. With only ~50 Galactic RCB stars known
to date, every new discovery has the potential to add to our knowledge of their population and
characteristics. To generate our candidate list, we selected all sources from the MACC with
P(RCB) > 0.1, which resulted in a total of 472 candidates. The selection criterion was motivated by
the CV experiment, which indicates that our candidate list should have an efficiency >80%. To obtain
an efficiency close to 1 would require visual examination of roughly 3000 sources.
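
Here efficiency means the fraction of known class members that a given selection would recover. A minimal sketch of that calculation from cross-validated probabilities is shown below; the function name `selection_efficiency` and the five probabilities are invented for illustration and are not the values in Table 1.

```python
def selection_efficiency(cv_probabilities, threshold):
    """Fraction of known class members whose held-out probability exceeds the threshold."""
    if not cv_probabilities:
        raise ValueError("need at least one cross-validated probability")
    recovered = sum(1 for p in cv_probabilities if p > threshold)
    return recovered / len(cv_probabilities)


# Made-up held-out probabilities for five known class members.
toy_cv = [0.9, 0.7, 0.45, 0.3, 0.05]
efficiency = selection_efficiency(toy_cv, threshold=0.1)
```

Lowering the threshold raises the efficiency at the price of a longer candidate list to inspect, which is the trade-off described above.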

Since the expected purity of our sample is small by design, we examine the light curves of all
sources within our candidate list by eye to remove sources that are clearly not RCB stars. These
interlopers are typically semi-regular pulsating variables or Mira variables, often with minimum
brightness levels below the detection threshold. We use the ALLSTARS Web interface (Richards et al.
2012a) to examine candidates, which in addition to light curves provides summary statistics (period,
amplitude, color, etc.) for each source, as well as links to external resources, such as SIMBAD. We
also remove any sources from the candidate list that are spectroscopically confirmed as non-carbon
stars.

Following the removal of these stars the candidate list was culled from 472 to 15 candidates we
considered likely RCB stars, for which we obtained spectroscopic follow-up observations. The general
properties of the 15 spectroscopically observed candidates, including their names, coordinates, and
RF probabilities, are summarized in Table 2. Finding charts using images from the Digitized Sky
Survey^10 (DSS) for the spectroscopically confirmed RCB and DYPer candidates can be found in
Figure 1. Six of the candidates selected for spectroscopic observations are known carbon stars
listed in the General Catalog of Galactic Carbon Stars (CGCS; Alksnis et al. 2001; see Table 2).

10 http://stdatu.stsci.edu/dss/

TABLE 2
RCB Candidates with P(RCB) > 0.1 from the MACC.

Name                Other ID         MACC^a  α_J2000.0^b    δ_J2000.0^b   CGCS^c  P(RCB)  R(RCB)^d  R/D/N^e
                                     ID      (hh mm ss.ss)  (dd mm ss.s)  ID
ASAS 060105+1654.7  V339 Ori         220556  06 01 04.65    +16 54 40.8   ...     0.466   25        N
ASAS 065113+0222.1  C* 596           223100  06 51 13.31    +02 22 08.6   1429    0.302   73        D
ASAS 073456-2250.1  V455 Pup         225801  07 34 56.24    -22 50 04.2   1782    0.123   283       N
ASAS 095221-4329.7  IRAS 09503-4315  232170  09 52 21.37    -43 29 40.5   ...     0.617   10        N
ASAS 153214-2854.4  BX Lib           242289  15 32 13.48    -28 54 21.6   ...     0.367   43        N
ASAS 162229-4835.7  IO Nor           244409  16 22 28.84    -48 35 55.8   ...     0.950   1         R
ASAS 162232-5349.2  C* 2322          244411  16 22 32.08    -53 49 15.6   3685    0.391   36        D
ASAS 165444-4925.9  C* 2377          245841  16 54 43.60    -49 25 55.0   3744    0.490   22        R
ASAS 170541-2650.1  GV Oph           246478  17 05 41.25    -26 50 03.4   ...     0.702   8         R
ASAS 180823-4439.8  V496 CrA         251092  18 08 23.05    -44 39 46.7   ...     0.110   389       N
ASAS 182658+0109.0  C* 2586          252675  18 26 57.64    +01 09 03.1   4013    0.115   343       D
ASAS 185817-3543.8  IRAS 18549-3547  255280  18 58 17.19    -35 43 44.7   ...     0.127   251       N
ASAS 191909-1554.4  V1942 Sgr        256869  19 19 09.60    -15 54 30.1   4229    0.543   17        D
ASAS 194245-2137.0  ...              258411  19 42 45.05    -21 36 59.8   ...     0.112   376       N
ASAS 203005-6208.0  NSV 13098        261023  20 30 04.96    -62 07 59.2   ...     0.340   52        R

Note. This table contains only those sources which were selected for spectroscopic follow-up
following visual inspection of their light curves.
a DotAstro ID: internal designation for the MACC.
b Reported coordinates from the Two Micron All Sky Survey point source catalog (Cutri et al. 2003).
c ID from the General Catalog of Galactic Carbon Stars (CGCS; Alksnis et al. 2001).
d Relative rank of P(RCB) including all sources from version 2.3 of the MACC not in the RCB training set.
e Flag indicating classification of the source following spectroscopic observations: R: RCB, D: DYPer, N: Neither.

2.4. Feature Importance

RF classifiers can provide quantitative feedback about the relative importance of each feature used
for classification. The RF feature importance measure describes the decrease in the overall
classifier performance following the replacement of a single feature with a random permutation of
its values (see Breiman 2001 for further details). We measure the importance of each feature using
the average importance from a one-versus-one classifier whereby the RCB class is iteratively
classified against each of the 27 other science classes on an individual basis. This procedure is
run five times and the average of all runs is taken to reduce the variance present in any single
run. Unsurprisingly, we find that amplitude is the most important feature. The importance measure
does not properly capture the covariance between features and as a result the majority of the
important features have to do with amplitude. The second and third most important features that are
not highly covariant with amplitude are qso_log_chi2nuNULL_chi2nu,^11 a measure of the dissimilarity
between the photometric variations of the source and a typical quasar, and freq1_harmonics_freq_0,
the best fit period. Interestingly, freq_signif, the significance of the best fit period of the
light curve, ranks as only the 31st most important out of the 71 features.
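
The permutation idea behind this importance measure can be illustrated with a short, self-contained Python sketch. It is a simplified version under stated assumptions: it measures the drop in plain accuracy on a fixed sample rather than Breiman's out-of-bag estimate, the name `permutation_importance` and its parameters are chosen for this example, and the toy classifier and data are invented.

```python
import random


def permutation_importance(predict, rows, labels, feature_index, n_repeats=5, seed=0):
    """Mean drop in accuracy after randomly permuting one feature column.

    rows is a list of tuples of feature values; predict maps a row to a label.
    """
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature_index] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with the shuffled value in place of the original.
        shuffled = [
            r[:feature_index] + (v,) + r[feature_index + 1:]
            for r, v in zip(rows, column)
        ]
        drops.append(baseline - accuracy(shuffled))
    return sum(drops) / n_repeats


def toy_predict(row):
    """Toy classifier that looks only at feature 0 (an amplitude-like value)."""
    return "RCB" if row[0] > 2.0 else "other"


rows = [(3.0, 0.1), (2.5, 0.9), (0.4, 0.2), (0.3, 0.8)]
labels = ["RCB", "RCB", "other", "other"]
amp_importance = permutation_importance(toy_predict, rows, labels, 0)
noise_importance = permutation_importance(toy_predict, rows, labels, 1)
```

A feature the classifier never uses has zero importance by construction, while permuting a feature it relies on can only lower the accuracy here. As noted in the text, this kind of measure does not account for covariance between features.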

11 This is the same as the χ²_{QSO,False} statistic, which is defined in Butler & Bloom (2011).

We summarize these findings with two-dimensional cuts through the multi-dimensional feature space
showing amplitude versus period significance, χ²_{QSO,False}, and period in Figure 2. We also show
amplitude versus P(RCB). In each panel we show the location of the RCB stars in the training set as
well as the newly discovered RCBs and DYPers presented in this paper, and we use the P(RCB) values
from the CV experiment from §2.2 for the RCB stars in the training set. We also show the location of
cuts necessary to achieve ~80% efficiency (blue dashed line) when selecting candidates using only
two features, as well as the cuts necessary to achieve ~100% efficiency (red dashed line). As would
be expected based on the results presented above, it is clear that χ²_{QSO,False} and period are far
more discriminating than period significance when selecting RCB candidates. To achieve an efficiency
near 100%, P(RCB) is vastly superior to any two-dimensional slice through feature space. We note
that the discretization seen in the distribution of P(RCB) is the result of using a finite number of
trials within the RF classifier. The probability of belonging to a class is defined as the total
number of times a source is classified within that class divided by the total number of trials.
These discrete values are then smeared following the calibration procedure described in §2.1.
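
The origin of the discretization can be seen in a short sketch of the vote-fraction definition. The function name `vote_fractions` and the four-trial example are hypothetical; the number of trials used for the MACC is not stated here.

```python
from collections import Counter


def vote_fractions(trial_votes):
    """Class probability = number of trials voting for the class / total number of trials."""
    counts = Counter(trial_votes)
    total = len(trial_votes)
    return {label: n / total for label, n in counts.items()}


# With four trials the only possible probabilities are 0, 0.25, 0.5, 0.75, and 1.
probs = vote_fractions(["RCB", "RCB", "RCB", "Mira"])
```

With N trials the raw probabilities are restricted to multiples of 1/N, which is why the uncalibrated distribution appears discrete.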

Many of the known and new RCB stars have very similar measured best periods clustered near ~2400 and
5300 days, which in each case corresponds to the largest period searched during the Lomb-Scargle
analysis in Richards et al. (2012b). Folding these light curves on the adopted periods clearly shows
that they are not periodic, despite the relatively high period significance scores (see the upper
left panel of Figure 2), which suggests some peculiarity in the feature generation process for these
sources. We are exploring improved metrics for periodicity to be used in future catalogs.
Nevertheless, despite these spurious period measurements, the ML classifier has correctly identified
that this feature tends to be erroneous for RCB stars, and as such it is a powerful discriminant for
finding new examples of the class.
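
Folding a light curve on a trial period simply maps each observation time to a phase between 0 and 1. A minimal sketch, with the function name `fold` and the reference epoch `t0` as assumptions of this example, is:

```python
def fold(times, period, t0=0.0):
    """Return the phase in [0, 1) of each observation time for a trial period."""
    return [((t - t0) / period) % 1.0 for t in times]


# Times in days folded on a 200-day trial period.
phases = fold([0.0, 100.0, 250.0, 400.0], period=200.0)
```

For a truly periodic source the magnitudes plotted against phase trace a single repeating curve. When the trial period is comparable to the length of the observing baseline, few or no cycles repeat, so a visual fold provides little evidence for or against periodicity.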

3. ARCHIVAL DATA AND NEW OBSERVATIONS

3.1. ASAS Photometry 

All optical photometric observations were obtained during ASAS-3, which was an extension of ASAS,
conducted at the Las Campanas Observatory (for further details on ASAS and ASAS-3 see Pojmański
1997, 2001). Light curves were downloaded from the ACVS^12 and imported into our DotAstro.org
(http://dotastro.org) astronomical light-curve warehouse for visualization and use with internal
frameworks (Brewer et al. 2009). The ACVS provides V-band measurements for a set of 50,124
pre-selected ASAS variables, measured in five different apertures of varying size (Pojmański 2002).
For each star in the catalog an optimal aperture selection procedure is used to determine the final
light curve, as described in Richards et al. (2012b).^13 The ASAS-3 V-band light curves for the
eight new RCB stars and DYPers are shown in Figure 3.

12 http://www.astrouw.edu.pl/asas/?page=acvs

Fig. 1. Optical finding charts of the newly discovered RCB stars and DYPers. Each finder is 5' x 5'
with north up and east to the left. The circles show the location of the targets and have
r = 33".5, which is the typical FWHM for ASAS images (Pojmański 1997). The large pixels on the ASAS
camera result in PSFs that include the light from several stars, meaning that some ASAS light curves
underestimate the true variability of the brightest star within the PSF.

3.2. Spectroscopy 

13 These optimal aperture light curves can be obtained from DotAstro.org.

Optical spectra of the candidate RCB stars were obtained between 2011 Sep. and 2012 May with the
Kast spectrograph on the Lick 3-m Shane telescope on Mt. Hamilton, California (Miller & Stone 1993),
the Low-Resolution Imaging Spectrometer (LRIS) on the 10-m Keck I telescope on Mauna Kea (Oke et al.
1995), and the RC Spectrograph on the SMARTS 1.5-m telescope at the Cerro Tololo Inter-American
Observatory (Subasavage et al. 2010). All spectra were obtained via long slit observations, and the
data were reduced and calibrated using standard procedures (e.g., Matheson et al. 2000; Silverman
et al. 2012). On each night of observations, we obtained spectra of spectrophotometric standards to
provide relative flux calibration for our targets. For queue-scheduled observations on the RC
spectrograph, all observations in a single night are conducted with the slit at the same position
angle. Thus, the standard stars and targets were not all observed at the parallactic angle, leading
to an uncertain flux correction (Filippenko 1982). We note, however, that the uncertainty in the
flux correction does not alter any of the conclusions discussed below. A summary of our observations
is given in Table 3, while the blue portions of the optical spectra are shown in Figures 4-5.

4. ANALYSIS 
4.1. Spectroscopic Confirmation 

While the unique photometric behavior of RCB stars makes them readily identifiable in well sampled
light curves taken over the course of several years, several high-amplitude variables have been
classified as RCB stars only to have that classification later refuted by spectroscopic
observations. Most of the misidentified candidates are either cataclysmic, symbiotic, or
semi-regular variables (see e.g., Lawson & Cottrell 1990; Tisserand et al. 2008). RCB stars are a
subclass of the HdC stars. For an RCB candidate to be confirmed as a true member of the class, its
spectrum must show the two prominent features of HdC stars: anomalously strong carbon absorption and
a lack of atomic and molecular H features.

To confirm the RCB candidates found in the ACVS,
we obtained low-resolution spectra of the 15 candidates
presented in § 2. Candidates observable from the northern
hemisphere were observed with Kast and LRIS, while
those only accessible from the southern hemisphere were
observed with the RC spectrograph. For some of the
southern hemisphere targets, very low resolution spectra
were obtained first to confirm the presence of C$_2$ before
slightly higher resolution observations were obtained (see
Table 3).

We searched the spectra for the presence of strong
carbon features, primarily C$_2$ and CN, and a lack of
Balmer absorption to confirm the RCB classification for
the ACVS candidates. We find these characteristics in
eight of the spectroscopically observed stars (see
Figures 4-5), which we consider good RCB and DYPer
candidates, as summarized in Table 4. The remaining
candidates were rejected as possible RCB stars based on
their spectra, which typically showed strong TiO and VO
absorption or clear evidence for H. The properties of the
rejected candidates are summarized in Table 5. In the
remainder of this paper we no longer consider these stars
candidates and restrict our discussion to the eight good
candidates listed in Table 4.

In addition to the hallmark traits of overabundant carbon
and a lack of Balmer absorption, RCB stars show a
number of other unique spectroscopic characteristics. In
particular, they show a very high ratio of $^{12}$C/$^{13}$C
and no evidence for G band absorption. To search for the
presence of $^{13}$C, we examined the spectra for the λ4744
band head of $^{12}$C$^{13}$C, which is typically very weak
or absent in the spectra of RCB stars. We find evidence for
$^{12}$C$^{13}$C in ASAS 191909-1554.4, ASAS 162232-5349.2,
ASAS 065113+0222.1, and ASAS 182658+0109.0, while ASAS
162232-5349.2 shows possible evidence for the
$^{13}$C$^{13}$C band at λ4752. The presence of $^{13}$C
suggests that these four stars are likely DYPers. We
consider these four stars closer analogs to DY Per and the
DYPers found in the LMC and SMC (Alcock et al. 2001;
Tisserand et al. 2009) than they are to classical RCB stars.
One of the DYPers, ASAS 065113+0222.1, shows weak evidence
for CH λ4300 (G band) absorption and possible evidence for
Hγ, which is sometimes seen in the spectra of DYPers.
We note that the signal-to-noise ratio (S/N) of all our
spectra in the range between ~4300-4350 Å is relatively
low, making definitive statements about the presence or
lack of both CH and Hγ challenging. Finally, we note
that we see evidence for the Merrill-Sanford bands of
SiC$_2$ in three of our candidates: ASAS 162232-5349.2,
ASAS 065113+0222.1, and ASAS 182658+0109.0. To
our knowledge this is the first identification of SiC$_2$
in a DYPer spectrum, though the presence of this molecule
should not come as a surprise as RCB stars are both C
and Si rich (Clayton 1996).

Fig. 2. — Two-dimensional cuts through the multi-dimensional feature space used to classify sources in version 2.3 of the MACC. Each
panel shows the location of all sources in the MACC (black points), as well as the RCB stars in the training set (cyan triangles), newly
discovered RCB stars (yellow stars), and new DYPers (orange circles). Also shown are cuts necessary to achieve ~80% (blue dashed line)
and ~100% (red dashed line) RCB selection efficiency. Next to these lines are the total number of ACVS sources within the cut region,
as well as the efficiency of recovering training set and new detections (shown in parentheses), respectively. Upper left: amplitude versus
period significance. Upper right: amplitude versus $\chi^2_{\rm QSO,False}$. Lower left: amplitude versus period. Also shown is the tight cluster of
Mira variables (red points), defined here as all ACVS sources with P(Mira) > 0.7. Lower right: amplitude versus P(RCB). Note that these
are highly covariant as P(RCB) is strongly dependent on amplitude, which is why the cuts presented are shown in a single dimension. The
P(RCB) values for the stars used in the training set are taken from the CV experiment from § 2.2.

Fig. 3. — ASAS V-band light curves of newly discovered RCB stars and DYPers. Note the differing magnitude ranges shown for each
light curve. Spectroscopic observations confirm the top four candidates to be RCB stars, while the bottom four are DYPers.

4.2. Photometric Behavior 

In addition to spectral differences, RCB stars and
DYPers show some dissimilarities in their photometric
evolution as well. The first-order behavior is the same:
both show deep, irregular declines in their light curves,
which can take anywhere from a few months to several
years to recover to maximum brightness. Beyond that
generic behavior, however, the shape of the decline tends
to differ: RCB stars show fast declines with slow
recoveries, whereas DYPers tend to show a more symmetric
decline and recovery.

The photometric properties of our candidates, including
decline rates for the most prominent and well-sampled
declines, are summarized in Table 4. As previously noted
in the caption of Figure 1, the full amplitude of the
variations of these stars is likely underestimated due to
the large PSF on ASAS images. This means that the decline
rates should be treated as lower limits, since the true
brightness of the star may be below that measured in
a large aperture. Nevertheless, the decline rates for the
TABLE 3

Log of Spectroscopic Observations.

Surviving column headers: UT Date, Instrument^a, Res.

a Kast: Kast spectrograph on the Lick 3-m telescope. LRIS: Low-Resolution Imaging Spectrometer on the Keck I 10-m
telescope. RC: RC spectrograph on the SMARTS 1.5-m telescope.
b The Kast spectrograph is a dual-arm spectrograph with a resolution of ~4 Å on the blue side, which is
relevant for the spectra shown in Figure 4. The typical resolution on the red side is ~10 Å.
c Exposure time for the blue arm of the spectrograph. Due to the red nature of the SED the exposure time
for the red arm was shorter than the blue.



TABLE 4

Observational Properties of New RCB Stars and DYPers.

Name                V_max^a  Δmag   Δt     dm/dt          $^{13}$C    H/CH                 Pulsations  RCB/DYPer
                    (mag)    (mag)  (d)    (mag day^-1)
ASAS 170541-2650.1  11.9     1.2    31     0.04           Weak 4744?  None                 N^b         RCB
ASAS 162229-4835.7  10.8     2.8    83     0.03           None        Weak Hγ?, weak CH    Y           RCB
ASAS 165444-4925.9  11.8     >1.6   >48    0.03           None        Weak Hγ?, Hβ?        Y           RCB
ASAS 203005-6208.0  13.2     >1.4   >30    0.05           None        Hγ?, Hβ? - blends    Y           RCB
ASAS 191909-1554.4  6.9      1.0    20     0.05           Y           None                 Y           DYPer
ASAS 162232-5349.2  11.5     1.7    <256   >0.007         Y           None                 Y           DYPer
ASAS 065113+0222.1  12.4     1.0    <140   >0.007         Y           Weak Hγ?, weak CH    Y           DYPer
ASAS 182658+0109.0  12.1     1.6    960    0.002          Y           None                 Y           DYPer

a Observed quantity, not corrected for Galactic reddening.
b The period of ASAS observations covers very little time around maximum light, and as a result there is a relatively short span of data suitable for
searching for pulsations. See § 4.3.



TABLE 5

Rejected RCB Candidates.

Name                V_max^a  Remarks
                    (mag)
ASAS 060105+1654.7  12.3     No C$_2$; strong H, G band
ASAS 073456-2250.1  12.8     C$_2$; strong H emission
ASAS 095221-4329.7  10.6     Strong TiO, VO; H?
ASAS 153214-2854.4  12.3     No C$_2$; Hα emission, strong G band; SRPV^b
ASAS 180823-4439.8  12.1     Strong TiO, VO
ASAS 185817-3543.8  10.9     Strong TiO, VO
ASAS 194245-2137.0  12.5     Strong TiO, VO

a Observed quantity, not corrected for Galactic reddening.
Fig. 4. — Blue optical spectra of the new candidate RCB stars. For reference, low-resolution spectra of NSV 11154, a cool RCB star, and
DY Per obtained in early 2012 are shown in blue. The new RCB stars all show clear evidence for strong molecular carbon absorption and
lack clear evidence for $^{13}$C, as the λ4744 band of $^{12}$C$^{13}$C is undetected in each. There is also a lack of evidence for strong H absorption, as
is expected for RCB stars.

Fig. 5. — Blue optical spectra of the four new DYPer candidates. For reference, low-resolution spectra of NSV 11154, a cool RCB star,
and DY Per obtained in early 2012 are shown in blue. The new DYPers show strong absorption from the carbon Swan bands, and the
λ4744 band of $^{12}$C$^{13}$C is clearly detected in each, similar to DY Per. The candidates also show a lack of clear evidence for H absorption.
Strong absorption from the Merrill-Sanford bands of SiC$_2$ is seen in three of the DYPers: ASAS 162232-5349.2, ASAS 065113+0222.1,
and ASAS 182658+0109.0.



four RCB stars are relatively fast and consistent with
those given in Tisserand et al. (2009) for RCB stars in
the MCs, ~0.04 mag day$^{-1}$. The most telling feature of
the light curves is the shape of the declines, however. For
the four spectroscopic RCB stars, ASAS 170541-2650.1,
ASAS 162229-4835.7, ASAS 165444-4925.9, and ASAS
203005-6208.0, the declines are very rapid. While we
do not detect ASAS 165444-4925.9 after its sharp
decline on TJD ≈ 5000, the other three show slow,
asymmetric recoveries to maximum light. The four
spectroscopic DYPers generally show a slower decline with a
roughly symmetric recovery, though we note that the full
recoveries of ASAS 065113+0222.1 and ASAS 162232-5349.2
are not observed.
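
The decline rates quoted in Table 4 are consistent with simply dividing the decline amplitude by its duration. The short Python sketch below illustrates that arithmetic with three rows of Table 4; the function name and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def decline_rate(delta_mag, delta_t_days):
    """Mean decline rate (mag/day) for a fade of delta_mag over delta_t_days."""
    if delta_t_days <= 0:
        raise ValueError("decline duration must be positive")
    return delta_mag / delta_t_days


# (decline amplitude in mag, decline duration in days), copied from Table 4.
table4 = {
    "ASAS 170541-2650.1": (1.2, 31),
    "ASAS 162229-4835.7": (2.8, 83),
    "ASAS 182658+0109.0": (1.6, 960),
}

rates = {name: decline_rate(dm, dt) for name, (dm, dt) in table4.items()}
```

Because the ASAS apertures likely underestimate the true amplitude, rates computed this way are best read as lower limits.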

4.3. Pulsations 

All RCB stars are variable near maximum light, with
most and possibly all of the variations thought to be due
to pulsation (Lawson et al. 1990; Clayton 1996). Typical
periods are ~40-100 days, and the amplitudes are a few
tenths of a magnitude. The pulsational properties of
DYPers are not as well constrained, because the sample is
both small and only recently identified. Each of the four
DYPers identified in Alcock et al. (2001) shows evidence
for periodic variability near maximum light, with typical
periods of ~100-200 days.



To search for the presence of pulsations in our
candidate RCB stars, we use a generalized Lomb-Scargle
periodogram (Lomb 1976; Scargle 1982; Zechmeister &
Kürster 2009) to analyze each star (see Richards et al.
2011 for more details on our Lomb-Scargle periodogram
implementation). Our analysis only examines data that
are well separated from decline phases, and we focus on
the portions of light curves where the secular trend is
slowly changing relative to the periodic variability. For
each star we simultaneously fit for the harmonic plus a
linear or quadratic long-term trend in the data. The
frequency that produces the largest peak in the
periodogram, after masking out the 1 day alias, is adopted
as the best-fit period.
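
The period search described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a single sinusoid fitted together with a polynomial trend by weighted least squares at each trial frequency; it is not the implementation of Richards et al. (2011), and the function names, the masking half-width, and the frequency grid are all assumptions made for the example.

```python
import numpy as np


def gls_with_trend(t, y, dy, freqs, trend_order=1):
    """Fractional chi-square reduction of sinusoid+trend over trend-only fits."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(dy, dtype=float)       # per-point weights (1/sigma)
    tc = t - t.mean()
    tn = tc / np.ptp(tc)                        # scaled time keeps the fit well conditioned
    trend = np.vander(tn, trend_order + 1)      # columns: tn^order, ..., tn, 1

    def chi2(design):
        coef, *_ = np.linalg.lstsq(design * w[:, None], y * w, rcond=None)
        resid = (y - design @ coef) * w
        return float(resid @ resid)

    chi2_trend = chi2(trend)                    # reference model: trend only
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * tc
        design = np.column_stack([np.sin(arg), np.cos(arg), trend])
        power[i] = (chi2_trend - chi2(design)) / chi2_trend
    return power


def best_period(t, y, dy, freqs, alias_freq=1.0, alias_halfwidth=0.05,
                trend_order=1):
    """Period of the highest peak after masking frequencies near the 1 day alias."""
    freqs = np.asarray(freqs, dtype=float)
    power = gls_with_trend(t, y, dy, freqs, trend_order)
    keep = np.abs(freqs - alias_freq) > alias_halfwidth   # cycles per day
    i = int(np.argmax(np.where(keep, power, -np.inf)))
    return 1.0 / freqs[i], power[i]
```

Setting `trend_order=2` switches the long-term trend from linear to quadratic. Fitting the trend and the sinusoid simultaneously, rather than detrending first, avoids absorbing part of the periodic signal into the trend.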

We find evidence for periodicity in the light curves of 
each star, except for ASAS 162229-4835.7. For most of 
the observing window ASAS 162229-4835.7 was in or
near a decline, and we predict that additional observa- 
tions of ASAS 162229-4835.7, be they historical or in 
the future, will show periodic variability near maximum 
light. The trend-removed, phase-folded light curves of 
the remaining seven stars are shown in Figure 6. Insets 
in each panel list the range of dates that were included in 
the Lomb-Scargle analysis, as well as the best fit period 
for the data. 

Some RCB stars are known to have more than a single
dominant period (see Clayton 1996 and references
therein). We find evidence for multiple periods in ASAS
191909-1554.4, with periods of 120, 175, and 221 days
that appear to change every ~1-2 years. Evidence for
multiple periods also appears to be present in ASAS
165444-4925.9. The best-fit periods in this case are 27
and 56 days, which differ by roughly a factor of two. The
longer period may in this case simply be a harmonic of
the shorter period. Finally, we note that the best-fit
period for ASAS 162232-5349.2, 359 days, is very close
to one year, and it is possible that the data are beating
against the yearly observation cycle. The folded light
curve appears to traverse a full cycle over ~half the full
phase cycle. The slight upturn in the folded data around
phase 0.15 suggests that the true period is likely ~180
days, half the best-fit period.
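
The two sanity checks used above (is one period close to an integer multiple of another, and is a period suspiciously close to the yearly sampling cycle?) can be expressed as small helper functions. The sketch below is illustrative only; the tolerances and function names are assumptions, not values used in the analysis.

```python
YEAR_DAYS = 365.25


def fold(times, period, t0=0.0):
    """Phase in [0, 1) of each observation time for a trial period."""
    return [((t - t0) / period) % 1.0 for t in times]


def harmonic_order(p_long, p_short, max_order=3, tol=0.1):
    """Return n if p_long is within a fractional tol of n * p_short, else None."""
    ratio = p_long / p_short
    n = round(ratio)
    if 2 <= n <= max_order and abs(ratio - n) / n <= tol:
        return n
    return None


def near_sampling_cycle(period, sampling=YEAR_DAYS, tol=0.05):
    """True if the period lies within a fractional tol of the sampling cycle."""
    return abs(period - sampling) / sampling <= tol
```

For example, `harmonic_order(56, 27)` flags the 27 and 56 day periods of ASAS 165444-4925.9 as a near 2:1 pair, and `near_sampling_cycle(359)` flags the 359 day period of ASAS 162232-5349.2 as a possible yearly alias, in which case folding at half that period with `fold` is a natural follow-up.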

4.4. Spectral Energy Distributions 

All RCB stars are known to have an infrared (IR) excess
due to the presence of circumstellar dust (Feast 1997;
Clayton 1996), and all the known DYPers in the MCs
also show evidence for excess IR emission (Tisserand
et al. 2009). To check for a similar excess in the new
ACVS RCB stars and DYPers, broadband spectral energy
distributions (SEDs) were constructed with catalog
data obtained from USNO-B1 (Monet et al. 2003), the
Two Micron All Sky Survey (2MASS; Skrutskie et al.
2006), the Wide-Field Infrared Survey Explorer (WISE;
Wright et al. 2010), AKARI (Murakami et al. 2007), the
Midcourse Space Experiment (MSX; Mill et al. 1994),
and the Infrared Astronomical Satellite (IRAS;
Neugebauer et al. 1984). The USNO-B1 catalog contains
measurements made on digital scans of photographic
plates corresponding roughly to the B, R, and I bands.
Repeated B and R plates were taken typically more than
a decade apart. To convert the five separate USNO-B1
magnitude measurements to the standard g'r'i' system
of the Sloan Digital Sky Survey (SDSS; Fukugita et al.
1996), we invert the filter transformations from Monet
et al. (2003; see also Sesar et al. 2006). The two
measurements each for the g' and the r' band are then
averaged to get the reported SDSS g' and r' magnitudes,
unless the two measurements differ by > 1 mag, in which
case it is assumed that the fainter observation occurred
during a fading episode of the star. In that case the final
adopted SDSS magnitude is that of the brighter
measurement. There is a large scatter in the
transformations from USNO-B1 to SDSS (Monet et al. 2003;
Sesar et al. 2006), which leads us to adopt a conservative
1-σ uncertainty of 40% in flux density on each of the
transformed SDSS flux measurements. The 2MASS magnitude
measurements are converted to fluxes via the calibration
of Cohen et al. (2003), and the WISE magnitudes are
converted to fluxes via the calibration in Cutri et al.
(2011). The remaining catalogs provide flux density
measurements in Jy rather than using the Vega magnitude
system.
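
The rule for combining the two photographic epochs, and the conversion from magnitude to flux density, can be written compactly. The Python sketch below is illustrative: the function names are invented for this example, and the zero point is a free parameter. The value 3631 Jy used in the usage lines is the standard AB zero point, chosen only as a placeholder; the band-by-band calibrations actually used come from the references cited above.

```python
def combine_epochs(mag1, mag2, max_diff=1.0):
    """Average two magnitudes, unless they differ by more than max_diff mag.

    In that case the fainter epoch is assumed to fall in a fading episode,
    and the brighter (numerically smaller) magnitude is adopted.
    """
    if abs(mag1 - mag2) > max_diff:
        return min(mag1, mag2)
    return 0.5 * (mag1 + mag2)


def mag_to_flux_jy(mag, zero_point_jy):
    """Flux density in Jy for a magnitude and a zero-magnitude flux in Jy."""
    return zero_point_jy * 10.0 ** (-0.4 * mag)


def flux_with_uncertainty(mag, zero_point_jy, frac_err=0.40):
    """Flux density and a fractional 1-sigma uncertainty (40% by default)."""
    flux = mag_to_flux_jy(mag, zero_point_jy)
    return flux, frac_err * flux


# Usage with made-up magnitudes and a placeholder zero point.
adopted = combine_epochs(12.0, 14.5)            # epochs differ by > 1 mag
flux, err = flux_with_uncertainty(adopted, 3631.0)
```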

The full SEDs extending from the optical to the
mid-infrared for each of the new RCB stars and DYPers
are shown in Figure 7. All of the candidates but
ASAS 203005-6208.0 saturate the W1 (3.5 μm) and
W2 (4.6 μm) bands of WISE, while of those all but
ASAS 170541-2650.1 and ASAS 165444-4925.9 saturate
W3 (11.6 μm) as well. ASAS 191909-1554.4 and
ASAS 182658+0109.0 saturate all three of the 2MASS
filters and are the only candidates detected at either
60 and/or 100 μm by IRAS. ASAS 162229-4835.7 and
ASAS 191909-1554.4 were the only candidates detected
at 90 μm by AKARI.

Catalog data for each of these surveys can be found at:
http://irsa.ipac.caltech.edu/

Fig. 6. — Folded light curves showing evidence for periodic
variability near maximum light in the new RCB stars and DYPer
candidates. The folded light curves only display a portion of the ASAS
observations as indicated in the legend for each source. ASAS
165444-4925.9 and ASAS 191909-1554.4 show evidence for
multiple dominant periods, which are shown with blue and red points
as indicated in their respective legends.

Using RCB stars and DYPers in the MCs, Tisserand
et al. (2009) find that the RCB stars typically have
SEDs with two distinct peaks, whereas DYPers typically
have a single peak. Tisserand et al. argue that the SEDs
of both can be understood as emission from a stellar
photosphere and surrounding dust shell; the cooler
photospheric temperatures of DYPers are less
distinct relative to the dust emission, leading to a
single broad peak rather than two. We caution that the
reddening toward each of the new Galactic candidates
is unknown, which makes a detailed analysis of their
SEDs challenging. Furthermore, the observations were
not taken simultaneously in each of the various
bandpasses. Nevertheless, a few interesting trends can be
gleaned from the data. The four stars that
spectroscopically resemble RCB stars, ASAS 170541-2650.1,
ASAS 162229-4835.7, ASAS 165444-4925.9, and ASAS
203005-6208.0, all show clear evidence for a
mid-infrared excess relative to their optical brightness.
The peak of emission from ASAS 191909-1554.4 and ASAS
182658+0109.0 is not well constrained because they
saturate the detectors between 1 and 6 μm, yet
interestingly both show evidence for an infrared excess
redward of 50 μm. This suggests that there might be some
very cool (T < 100 K) dust in the circumstellar environment
of these stars, which is observed in some of the bright,
nearby RCB stars. For instance, Spitzer and Herschel
observations of R CrB show evidence for a large, ~4 pc,
cool, T ~ 25 K, and diffuse shell of gas that is detected in
the far-IR (Clayton et al. 2011). ASAS 162232-5349.2
and ASAS 065113+0222.1 show evidence for a single
broad peak in their SED occurring around ~2 μm, which
is similar to the SEDs of the DYPers observed in the
MCs.

Fig. 7. — Optical through mid-infrared spectral energy
distributions for newly discovered RCB stars and DYPers. The
observations in the various passbands were not taken simultaneously, thus
in some cases the lack of a smooth spectrum is likely the result of
intrinsic variability. The reddening towards each star is uncertain.

4.5. Near-infrared Variability 

An overlap in the survey fields between 2MASS and
the DEep Near Infrared Survey of the Southern Sky (DENIS;
Epchtein et al. 1994) allows measurements of the NIR
variability of four of the newly discovered RCB stars and
DYPers. Photometric measurements from 2MASS and
the two epochs of DENIS observations for these stars
are summarized in Table 6. Unfortunately, the 2MASS
and DENIS observations preceded the ASAS monitoring,
and so we cannot provide contextual information
such as the state of the star (near maximum, on
decline, during deep minimum, etc.) at the time of the
NIR observations. To within a few tenths of a magnitude,
ASAS 162229-4835.7 is not variable between the
2MASS and DENIS observations. On timescales of a few
weeks to months both ASAS 162232-5349.2 and ASAS
170541-2650.1 show evidence for variations > 1 mag in
the NIR. Similar variations have been observed for
several of the RCB stars and DYPers in the MCs (e.g.,
Tisserand et al. 2004, 2009). The largest variations were
observed in ASAS 203005-6208.0, which changed by ~4
mag in the J band during the ~4 yr between the DENIS
and 2MASS observations. ASAS 203005-6208.0 also
shows a large variation between the DENIS I-band
measurement and the I-band measurement from USNO-B1,
with Δm ≈ 7 mag. This star is clearly a large-amplitude
variable, which likely explains its unusual SED. In the
DENIS observations, which provide simultaneous optical
and NIR measurements, ASAS 203005-6208.0 is always
fainter in the optical, suggesting that the unusual shape
of its SED (see Figure 7) is the result of non-coeval
observations.

5. INDIVIDUAL STARS 

We discuss the individual stars and whether they
should be considered RCB stars or DYPers below. We
also use SIMBAD to identify alternate names for these
stars and previous studies in the literature (see also
Table 2).

5.1. ASAS 170541-2650.1 (GV Oph) 

This star was first identified as a variable source on 
Harvard photographic plates with the name Harvard 
Variable 4368, and was cataloged as a likely long period 

TABLE 6

2MASS and DENIS NIR Measurements.

Name                Epoch_2MASS^a  J_2MASS^a  H_2MASS^a  Ks_2MASS^a  Epoch_DENIS^b  I_DENIS^b  J_DENIS^b  Ks_DENIS^b
                    (JD)           (mag)      (mag)      (mag)       (JD)           (mag)      (mag)      (mag)
ASAS 170541-2650.1  2451004.658    10.01      9.28       8.46        2451730.639    11.09      10.18      9.13
                                                                     2451749.592    9.94       8.91       7.96
ASAS 162229-4835.7  2451347.541    7.26       6.47       5.73        2451387.540    9.10       6.90       5.40
                                                                     2451395.492    8.98       7.13       5.45
ASAS 162232-5349.2  2451347.538    7.24       6.01       5.36        2451387.532    9.12       6.59       4.40
                                                                     2451395.485    9.20       6.30       4.28
ASAS 203005-6208.0  2451701.878    11.73      11.20      10.40       2450267.768    17.62      15.66      12.06
                                                                     2451003.746    15.82      14.37      11.89

Note. —
a Catalog measurement from the 2MASS point source catalog (Cutri et al. 2003).
b Catalog measurement from the DENIS point source catalog (Epchtein et al. 1999).

variable based on the large amplitude of variations from
13.9 mag to below the photographic limit of ~16.5 mag
(Swope 1928). It was later named GV Oph in the General
Catalog of Variable Stars (GCVS; Kukarkin et al.
1971; Samus et al. 2008) as a variable of unknown type
with rapid variations. The light curve, spectrum, and
SED of this star are consistent with it being an RCB
star.

5.2. ASAS 162229-4835.7 (IO Nor)

This star is listed as a Mira variable in the GCVS with
the name IO Nor. In Clarke et al. (2005) it is identified
as a star with an IR excess based on MSX observations.
On the basis of its NIR and mid-IR colors, it is identified
as an RCB candidate in Tisserand (2012), and considered
a likely RCB star on the basis of its ASAS light curve.
We independently identified IO Nor as a likely RCB star
on the basis of its light curve (in the MACC it is the most
likely RCB in ACVS), and our spectrum confirms that it
is a genuine RCB star. The previous classification as a
Mira variable is likely based on the late spectral type and
large amplitude of variability, but Figure 3 clearly shows
that ASAS 162229-4835.7 is not a long period variable.

5.3. ASAS 165444-4925.9 (C* 2377)

The variability of this star has not been cataloged to
date, and it is listed in the CGCS as C* 2377 (Alksnis
et al. 2001). The spectrum, SED, and pulsations
exhibited by this star are consistent with RCB stars. There
may be evidence for weak CH absorption, though we
caution that the S/N is low near ~4300 Å. The light curve
shows a sharp decline, similar to RCB stars, but the
recovery is not observed. Nevertheless, the evidence points
to ASAS 165444-4925.9 being a new member of the RCB
class.

In Tisserand (2012) two additional stars with IR colors
consistent with RCB stars, V653 Sco and V581 CrA, are identified as
highly likely RCB stars on the basis of their ASAS light curves.
V581 CrA is not included in ACVS and therefore is not included
in the MACC. V653 Sco is listed in the GCVS as a Mira variable
and classified as a semi-regular periodic variable in the MACC,
P(SRPV) = 0.55. It has P(RCB) = 0.012 and R(RCB) = 2609.
The light curve is somewhat similar to ES Aql, in that it fades
below the ASAS detection limits and it is highly active during the
≈8 yr it was observed, meaning that it folds decently well on a
period of ~450 days. A spectrum will be needed to disambiguate
between an RCB and long period variable classification for V653
Sco.


5.4. ASAS 203005-6208.0 (NSV 13098) 

This star was first identified as variable by Luyten
(1932) with a maximum brightness of 14 mag and a
minimum > 18 mag. Luyten assigned it the name
AN 141.1932, and it was later cataloged as a possible
variable star in the GCVS as NSV 13098. The light
curve, spectrum, and SED are all consistent with an
RCB classification. Higher resolution and S/N spectra
are needed to confirm whether H absorption is present,
though we note that some RCB stars do show evidence of H in
their spectra (e.g., V854 Cen; Kilkenny & Marang 1989),
leading us to conclude that ASAS 203005-6208.0 is an
RCB star.

5.5. ASAS 191909-1554.4 (V1942 Sgr) 

This star is listed as a slow irregular variable of late
spectral type with the name V1942 Sgr in the GCVS. It
is the brightest star among our candidates, and as such
it is one of the best studied carbon stars to date.
According to SIMBAD, ASAS 191909-1554.4 is discussed
in more than 50 papers in the literature. In the CGCS
it is listed as C* 2721. ASAS 191909-1554.4 is detected
by Hipparcos (HIP 94940) and has a measured parallax
of 2.52±0.82 mas (Perryman et al. 1997). This
corresponds to a distance d = 397 ± 115 pc and a distance
modulus μ ≈ 8 mag. ASAS 191909-1554.4 is one of
the few Galactic carbon stars with a measured parallax,
and it is important for constraining the luminosity
function of carbon stars (Wallerstein & Knapp 1998).
In their spectral atlas of carbon stars, Barnbaum et al.
(1996) identify ASAS 191909-1554.4 as having a spectral
type of N5+ C$_2$ 5.5. Relative abundance measurements
by Abia & Isern (1997) show that $^{12}$C/$^{13}$C = 30, which
is low relative to classical RCB stars. The proximity of
ASAS 191909-1554.4 allows its circumstellar dust shell
to be resolved in IRAS images (e.g., Young et al. 1993),
and Egan & Leung (1991) use ASAS 191909-1554.4 and
other carbon stars with resolved dust shells and 100 μm
excess to statistically argue that each of these stars must
be surrounded by two dust shells, one that is old, ~10$^4$
yr, and the other that is produced by a current episode of
mass loss. Recent H I observations by Libert et al. (2010)
have shown evidence for the presence of H in the
circumstellar shell of ASAS 191909-1554.4. The shallow,
symmetric fade of the light curve, along with the N-type
carbon star spectrum and the presence of $^{13}$C in the
spectrum, leads us to conclude that ASAS 191909-1554.4 is
a DYPer. This is the only candidate within our sample
for which we can measure the absolute magnitude, since
we have the Hipparcos parallax measurement. Adopting
a maximum light brightness of V_max = 6.8 mag, we find
that ASAS 191909-1554.4 has M_V ≈ -1.2 mag. This is
roughly 0.4 mag fainter than the faintest DYPers in the
MCs (Tisserand et al. 2009), suggesting that either the
luminosity function extends fainter than that observed
in the MCs or there is unaccounted for dust extinction
toward ASAS 191909-1554.4.
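
The distance, distance modulus, and absolute magnitude quoted above follow from the standard parallax relations d = 1/π and μ = 5 log10(d / 10 pc). A minimal Python sketch of that arithmetic is given below; the function names are illustrative, and, as in the text, no extinction correction is applied.

```python
import math


def distance_pc(parallax_mas):
    """Distance in parsecs from a parallax in milliarcseconds."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return 1000.0 / parallax_mas


def distance_modulus(d_pc):
    """Distance modulus mu = 5 log10(d / 10 pc)."""
    return 5.0 * math.log10(d_pc / 10.0)


# Values quoted in the text for ASAS 191909-1554.4 (V1942 Sgr).
d = distance_pc(2.52)            # Hipparcos parallax in mas
mu = distance_modulus(d)
M_V = 6.8 - mu                   # adopted V_max minus distance modulus
```

Any dust extinction along the line of sight would make the true M_V brighter than this estimate, which is one of the two explanations offered above for the star appearing fainter than the MC DYPers.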

5.6. ASAS 162232-5349.2 (C* 2322) 

The variability of this star has not been cataloged to
date, and it is listed in the CGCS as C* 2322. The
relatively slow, symmetric decline and recovery in the
light curve of ASAS 162232-5349.2 lead us to classify it
as a DYPer variable. The presence of $^{13}$C in the spectrum
and the single peak in the SED support this classification.

5.7. ASAS 065113+0222.1 (C* 596) 

The variability of this star has not been cataloged to
date, and it is listed in the CGCS as C* 596. The
presence of $^{13}$C in the spectrum and the single peak in the
SED lead us to classify ASAS 065113+0222.1 as a DYPer
variable.

5.8. ASAS 182658+0109.0 (C* 2586) 

The variability of this star has not been cataloged to
date, and it is listed in the CGCS as C* 2586. Based
on weekly averages of DIRBE NIR observations taken
over 3.6 yr, Price et al. (2010) list ASAS 182658+0109.0
as a non-variable source. Low resolution spectra taken
with IRAS show 11 μm SiC dust emission, which
typically indicates significant mass loss from a carbon star
(Kwok et al. 1997). The light curve shows a ~5-6 yr
symmetric decline, and there is clear evidence for $^{13}$C
in the spectrum. The IRAS detection at 60 μm shows
a clear IR excess relative to a single temperature
blackbody. While there is no evidence for Hα, the S/N in
our spectrum is low blueward of ~4700 Å. We consider
ASAS 182658+0109.0 a likely DYPer, though higher S/N
spectra are required for a detailed abundance analysis to
confirm this classification.

6. DISCUSSION 
6.1. New Candidates from an Expanded Training Set 

As mentioned in § 2.1, one of the major strengths of
ML classification is that new discoveries may be fed back
into the machinery in order to improve future iterations
of the classifier. In an attempt to recover more RCB
stars and DYPers in the ACVS that were missed in our
initial search of the MACC, we created an augmented
RCB training set by adding the eight new RCB stars
and DYPers identified in this paper to the 17 sources
already included in the training set. This augmented
training set should increase the likelihood of discovering
new candidates, particularly DYPers, of which there were
no examples in the original training set.

Using the augmented training set, we re-ran the RF
classifier from Richards et al. (2012b) on all the ACVS
light curves to search for any additional good candidates.
We focus our new search on candidates with a significant
change in R(RCB), which were not examined in the
initial search of the MACC. In particular, we visually
examine the light curves of all sources with R(RCB)_new < 500,
P(RCB)_MACC < 0.1, and R(RCB)_new < R(RCB)_MACC.
There are a total of 96 sources that meet these criteria,
which were not included in the 472 visually inspected
sources from the original MACC. Of these 96, we
conservatively select six as candidate RCB stars or DYPers.
One is a highly likely RCB star with multiple declines
and asymmetric recoveries, three show evidence for a
single decline which is only partially sampled, and two are
known carbon stars that are likely semi-regular periodic
variables. We list the candidates in Table 7 with brief
comments on each and show their light curves in
Figure 8.
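
The three selection criteria above amount to a simple filter over the catalog. The Python sketch below illustrates it, assuming that R(RCB) is the rank of a source when the catalog is ordered by P(RCB); the `Source` class, its field names, and the four example rows are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    p_rcb_macc: float   # P(RCB) in the original MACC
    r_rcb_macc: int     # R(RCB) in the original MACC
    r_rcb_new: int      # R(RCB) after retraining with the augmented set


def needs_visual_inspection(src, max_rank=500, max_p_macc=0.1):
    """Apply the three cuts: high new rank, low original P(RCB), improved rank."""
    return (src.r_rcb_new < max_rank
            and src.p_rcb_macc < max_p_macc
            and src.r_rcb_new < src.r_rcb_macc)


# Hypothetical rows, not taken from the MACC.
catalog = [
    Source("hypothetical A", 0.05, 1200, 380),   # passes all three cuts
    Source("hypothetical B", 0.30, 150, 90),     # P(RCB)_MACC too high
    Source("hypothetical C", 0.02, 400, 450),    # rank did not improve
    Source("hypothetical D", 0.01, 3000, 900),   # new rank still too low
]
selected = [s.name for s in catalog if needs_visual_inspection(s)]
```

The cut on P(RCB)_MACC keeps the new search from revisiting sources that would already have been examined in the original MACC inspection.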

6.2. Future Improvements to the Classifier 

Restricting our search for bright RCB stars to only
those sources in the ACVS has biased the results of our
search. As mentioned in § 2.2, there are seven known
RCB stars that show clear variability in their ASAS light
curves yet were not selected for inclusion in the ACVS.
This suggests that several large-amplitude ASAS variables
are missing from the ACVS, presumably including
a few unknown RCB stars. This bias can be corrected
by searching all of ASAS for RCB stars; however,
such a search would introduce significant new challenges,
as there are ~12 million sources in ASAS. In addition,
our existing classification framework is not designed to
deal with a catalog in which the overwhelming majority of
sources are not in fact variable. Nevertheless, both of
these challenges must be addressed prior to the LSST
era. We have developed frameworks that can ingest millions
of light curves and are currently experimenting with
methods to deal with non-variable sources, the results
of which will be presented in a catalog with classifications
for all ~12 million sources in ASAS (Richards et al.,
in prep). Furthermore, it has been shown that
mid-infrared colors are a powerful discriminant when
selecting RCB stars (Tisserand et al. 2011; Tisserand
2012). While our use of NIR colors is important
for selecting RCB stars (see, e.g., Soszyński et al. 2009),
adding the mid-infrared measurements from the all-sky
WISE survey will dramatically improve our purity when
selecting RCB candidates, as several of the Mira and
semi-regular variables that served as interlopers in the current
search (§ 2.3) would be eliminated with the use of
mid-infrared colors.
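
The text does not say which methods are being tested for handling non-variable sources. As a rough illustration only, one common baseline is to screen each light curve with the reduced chi-square of a constant-brightness model before classification. The function names, the sample magnitudes, and the threshold of 3.0 below are all assumptions made for this sketch, not values from this work.

```python
def reduced_chi2_constant(mags, errs):
    """Reduced chi-square of a constant-brightness model.

    The constant is the inverse-variance weighted mean magnitude.
    """
    weights = [1.0 / e**2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


def is_variable(mags, errs, threshold=3.0):
    """Flag a source as variable; the threshold is an illustrative assumption."""
    return reduced_chi2_constant(mags, errs) > threshold


# Invented V-band magnitudes with 0.03 mag uncertainties.
errors = [0.03] * 5
constant_star = [12.00, 12.02, 11.98, 12.01, 11.99]
declining_star = [12.0, 12.1, 13.5, 15.0, 13.0]

print(is_variable(constant_star, errors))
print(is_variable(declining_star, errors))
```

A screen of this kind would only reduce the number of light curves passed to the classifier; it would not replace the classification step.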

7. CONCLUSIONS 

We have used the 71-feature Random Forest machine-learning
ACVS classification catalog from Richards et al.
(2012b) to identify likely DYPers and RCB stars in the
ACVS. The RF classifier provides several advantages
over previous methods to search time-domain
survey data for RCB stars and DYPers. Previously successful
searches for RCB stars have developed a methodology
focused on large-amplitude variables that do not
show strong evidence for periodicity (e.g., Alcock et al.
2001; Zaniewski et al. 2005; Tisserand et al. 2008). While
the RF classifier is capable of capturing the large variations
and irregular declines observed in RCB stars, the
use of many features allows complex behavior, such as
the shape of the decline and recovery, to be captured as
TABLE 7

New RCB/DYPer candidates using an augmented training set.

Name                Other ID                 DotAstro  α_J2000.0^b    δ_J2000.0^b    CGCS  P(RCB)^d  R(RCB)^e  Remarks
                                             ID^a      (hh mm ss.ss)  (dd mm ss.s)   ID^c  new       new

ASAS 053302-1808.0  IRAS 05301-1805          219583    05 33 01.72    -18 07 59.0    980   0.339     380       1
ASAS 081121-3734.9  C* 1086                  227950    08 11 21.39    -37 34 54.2    2106  0.145     492       1
ASAS 125245-5441.6  ...                      237449    12 52 44.92    -54 41 37.5    ...   0.309     394       2
ASAS 160033-2726.3  1RXS J160033.8-272614?   243486    16 00 33.16    -27 26 18.5    ...   0.142     498       2,3
ASAS 175226-3411.5  IRAS 17491-3410          249729    17 52 25.50    -34 11 28.2    ...   0.166     460       ...
ASAS 200531-0427.2  V902 Aql                 259768    20 05 30.83    -04 27 12.8    ...   0.382     353       2,4

Note. Remarks: 1. Known carbon star, shows evidence for multiple declines that may be periodic. Semi-regular variable?
2. Shows evidence for a single partially resolved decline or recovery. 3. This star is ~9" from the cataloged X-ray
source 1RXS J160033.8-272614. The possible association with an X-ray counterpart suggests that it may be a Be star.
4. This star is listed as having an M spectral type in the GCVS. We were unable to find a published reference listing
this spectral type and suggest a new spectrum be taken to determine the spectral type.

^a ID from the MACC.
^b Reported coordinates from the Two Micron All Sky Survey point source catalog (Cutri et al. 2003).
^c ID from the General Catalog of Galactic Carbon Stars (CGCS; Alksnis et al. 2001).
^d Probability of belonging to the RCB class when using the augmented training set.
^e Relative rank of P(RCB) when using the augmented training set.
Fig. 8. ASAS V-band light curves of new RCB candidates found using an augmented training set, as described in § 6.1.
well. Another advantage of RF classification is that it
does not require hard cuts on any individual light-curve
feature; such cuts can exclude real RCB stars from the final
candidate selection. There are a total of 472 stars with
P(RCB) > 0.1 in version 2.3 of the MACC, 15 of which
were selected as good RCB or DYPer candidates on the basis of
visual inspection and existing spectroscopic information.
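
The first selection step described above (keep sources with P(RCB) > 0.1, then order them for visual inspection) can be sketched in a few lines. This is a minimal illustration, not the MACC pipeline: the source names and probabilities are invented, the function name is hypothetical, and the rank is assumed to be simply the position in descending order of probability.

```python
# Hypothetical catalog rows: (source name, P(RCB)). Values are invented.
catalog = [
    ("src_A", 0.62),
    ("src_B", 0.04),
    ("src_C", 0.17),
    ("src_D", 0.35),
    ("src_E", 0.09),
]


def rank_candidates(rows, threshold=0.1):
    """Return (rank, name, probability) for rows above the threshold.

    Rank 1 is assigned to the highest probability (an assumption about
    how the rank is defined).
    """
    selected = [row for row in rows if row[1] > threshold]
    selected.sort(key=lambda row: row[1], reverse=True)
    return [(rank, name, prob)
            for rank, (name, prob) in enumerate(selected, start=1)]


candidates = rank_candidates(catalog)
for rank, name, prob in candidates:
    print(f"{rank:3d}  {name}  P(RCB) = {prob:.2f}")
```

Because the cut is applied to the class probability rather than to any single light-curve feature, a source with one unusual feature value can still enter the candidate list.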

Following spectroscopic observations, eight of the good
candidates were identified as bona fide RCB stars or
DYPers. Four of these, ASAS 170541-2650.1 (GV Oph),
ASAS 162229-4835.7 (IO Nor), ASAS 165444-4925.9,
and ASAS 203005-6208.0, were confirmed as new RCB
stars on the basis of (i) light curves showing irregular,
sharp declines of large amplitude (Δm_V ≫ 1
mag), (ii) carbon-rich spectra showing a lack of evidence
for H and ¹³C, and (iii) the mid-infrared excess
observed in their SEDs. The other four, ASAS
191909-1554.4 (V1942 Sgr), ASAS 065113+0222.1,
ASAS 162232-5349.2, and ASAS 182658+0109.0, appear
to be Galactic DYPers on the basis of (i) shallow, symmetric
declines in their light curves occurring at irregular
intervals, (ii) carbon-rich spectra resembling those of carbon
N stars, showing ¹³C and weak or no H, and (iii) SEDs
that show a single peak but are too broad to be
explained by a single-temperature blackbody. With the
exception of ASAS 170541-2650.1, all of the new candidates
show evidence for periodic variability near maximum
light. We incorporate the newly confirmed RCB
stars and DYPers into the training set to identify six new
candidates as likely RCB stars.

Our effort has increased the number of known Galactic
DYPers from two to six. While the sample size is small,
it appears that DYPers have pulsations with period P >
100 days at maximum light, which is longer than the
typical timescale for pulsation in RCB stars (see also
Alcock et al. 2001). Each of the new RCB stars and
DYPers is bright, V_max ≲ 12 mag, which will enable
high-resolution spectroscopy for future studies of the detailed
abundances of these stars. This is particularly important
in the case of the four new DYPers, as DY Per itself is
the only member of the class that has been observed at
high resolution to confirm the lack of H absorption in the
spectrum (Keenan & Barnbaum 1997; Začs et al. 2007).
If these stars are shown to be H deficient, it would be
strong evidence that DYPers are the cool (T_eff ~ 3500 K)
analogs to RCB stars.

We view the results presented herein as one culmination
of a broader effort to extract novel science from
the time-domain survey data deluge. Earlier work focused
on determining the most suitable ML frameworks
for classification and subsequent classification efficiency
(see Bloom & Richards 2011 for a review). While the
production of ML-based catalogs (e.g., ACVS; Pojmański
2000) has been the norm for over a decade, we know of
no concerted effort to validate the predictions of those
catalogs. Now having a probabilistic catalog of variable
sources (Richards et al. 2012b) to work with, we can select
our demographic priors on classes of interest and
also decide just how many false positives we are willing
to tolerate in the name of improved efficiency. In the
case of the construction of a new set of very common
stars (e.g., an RR Lyrae catalog), we might be willing to
tolerate a reduced discovery efficiency to preserve a high
level of purity. Management of the available resources to
follow up the statements made in a probabilistic catalog
becomes the next challenge. We were most interested
in finding new exemplars of two rare classes and
thus tolerated a high impurity. In the discovery and
characterization of several bright RCB stars and DYPers, the
payoff of the efficient use of follow-up resources enabled
by probabilistic classification is evident.
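
The trade-off between purity and discovery efficiency described above can be made concrete with a toy calculation. In this sketch the follow-up outcomes are invented, the function name is hypothetical, and "efficiency" is taken to mean the fraction of true class members recovered by the selection, which is an assumption about the intended definition.

```python
# Hypothetical follow-up outcomes: (class probability, confirmed member).
followed_up = [
    (0.90, True), (0.75, True), (0.60, False), (0.45, True),
    (0.30, False), (0.20, True), (0.15, False), (0.12, False),
]


def purity_and_efficiency(outcomes, threshold):
    """Purity: confirmed fraction of the selected sources.
    Efficiency: selected fraction of all confirmed members."""
    selected = [is_member for prob, is_member in outcomes if prob > threshold]
    n_true_total = sum(is_member for _, is_member in outcomes)
    n_true_selected = sum(selected)
    purity = n_true_selected / len(selected) if selected else float("nan")
    efficiency = (n_true_selected / n_true_total
                  if n_true_total else float("nan"))
    return purity, efficiency


for threshold in (0.1, 0.4, 0.7):
    purity, efficiency = purity_and_efficiency(followed_up, threshold)
    print(f"threshold {threshold:.1f}: "
          f"purity {purity:.2f}, efficiency {efficiency:.2f}")
```

With these invented numbers, raising the threshold from 0.1 to 0.7 raises the purity from 0.50 to 1.00 while the efficiency falls from 1.00 to 0.50. A low threshold therefore suits a search for rare classes, where missing a true member costs more than following up an interloper.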

The classification taxonomy of variable stars clearly
conflates phenomenology (e.g., "periodic") with a
physical understanding ("pulsating") of the origin of
what is observed. And while phenomenologically based
mining around an envelope of class prototypes can turn
up new class members, we have shown that the diversity
of RCB stars and DYPers demands an expanded
approach to discovery. We speculate that the richness
and connections of the feature set in the ML search may
also be capturing some of the phenomenological manifestations
of the underlying physics, however nuanced, that
we cannot (yet) express.